A Simple Focused Crawler
Related resources
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...
Priority based Semantic Web Crawler
The Internet has billions of web pages, and these pages are linked to each other using URLs (Uniform Resource Locators). A web crawler is a main module of a search engine that gathers these documents from the WWW. Most web pages on the Internet are active and change periodically. Thus, the crawler is required to revisit these pages to keep the search engine's database up to date. In this paper, pri...
A Novel Architecture of Mercator: A Scalable, Extensible Web Crawler with Focused Web Crawler
This paper describes a novel architecture of Mercator: a scalable, extensible web crawler with a focused web crawler. We enumerate the major components of any scalable and focused web crawler and describe the particular components used in this novel architecture. We also describe the architecture's support for extensibility and downloaded user’s support information. We also describe how the ...
Reinforcement Learning with Classifier Selection for Focused Crawling
Focused crawlers are programs that wander in the Web, using its graph structure, and gather pages that belong to a specific topic. The most critical task in Focused Crawling is the scoring of the URLs as it designates the path that the crawler will follow, and thus its effectiveness. In this paper we propose a novel scheme for assigning scores to the URLs, based on the Reinforcement Learning (R...
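The role of URL scoring described above can be illustrated with a minimal sketch of a best-first crawl frontier. This is not the reinforcement learning scheme the abstract proposes: the scorer here is a simple keyword-overlap stand-in, and the names `score_url`, `focused_crawl`, and the in-memory `graph` (used in place of fetching real pages) are all assumptions made for illustration.

```python
import heapq


def score_url(anchor_text, topic_terms):
    """Hypothetical scorer: fraction of topic terms found in the anchor text.

    Assumes `topic_terms` is a non-empty set of lowercase words.
    """
    words = set(anchor_text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(graph, seed, topic_terms, limit):
    """Visit up to `limit` pages, always expanding the highest-scored URL first.

    `graph` maps a URL to a list of (linked_url, anchor_text) pairs and
    stands in for downloading and parsing real pages.
    """
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so scores are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < limit:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        for link, anchor in graph.get(url, []):
            if link not in seen:
                seen.add(link)
                priority = -score_url(anchor, topic_terms)
                heapq.heappush(frontier, (priority, link))
    return visited


# Toy link graph (made-up URLs and anchor texts).
graph = {
    "seed": [("a", "cooking recipes"), ("b", "web crawler design")],
    "a": [("d", "pasta")],
    "b": [("c", "focused crawler")],
}
order = focused_crawl(graph, "seed", {"web", "crawler"}, limit=3)
```

Because the frontier is ordered by score, the on-topic branch through `b` is followed before the off-topic page `a` is ever fetched, which is the sense in which the scoring determines the crawler's path.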
A Website Model-Supported Focused Crawler for Search Agents
This paper advocates the use of ontology-supported website models to provide a semantic level solution for a search agent so that it can provide fast, precise, and stable search results. Based on this technique, we have developed a focused crawler, which can benefit both user requests and domain semantics. Equipped with this technique, our focused crawler manifests the following interesting feat...